Abstract: This paper reviews recent developments in robust estimation for
linear time series models, with short and long memory correlation structures,
in the presence of additive outliers. Based on Fajardo et al. (2009) and
Levy-Leduc et al. (2011a), the emphasis is on the following topics: the
influence of additive outliers on the estimation of a time series, the
asymptotic properties of a robust autocovariance function, and a robust
semiparametric estimation method for the fractional parameter d in
ARFIMA(p, d, q) models. Simulations are used to support the use of the robust
method when a time series has additive outliers. The invariance of the
estimators to first differencing in ARFIMA models with outliers is also
discussed. In general, the robust long-memory estimator is outlier resistant
and invariant to first differencing.
1. Introduction

Let $\{X_t\}_{t\in\mathbb{Z}}$ be a stationary time series with spectral density that behaves like

$$f_X(\omega) \sim h(\omega)\,|\omega|^{-2d}, \quad \text{as } \omega \to 0,$$

where $h(\omega)$ is a nonvanishing and continuously differentiable function with bounded derivative for $-\pi < \omega < \pi$, and $d < 0.5$.

A well-known stationary parametric model with the above spectral density is the ARFIMA(p, d, q) process, which is the solution of the equation

$$X_t - \mu = (1 - B)^{-d}\,\eta_t, \quad t \in \mathbb{Z}, \tag{1.1}$$

where $\eta_t = \frac{\Theta(B)}{\Phi(B)}\varepsilon_t$ is the ARMA(p, q) process, $\mu$ is the mean (here it is assumed that $\mu = 0$), $\Phi(B) = 1 - \sum_{j=1}^{p}\phi_j B^j$, $\Theta(B) = 1 - \sum_{j=1}^{q}\theta_j B^j$, and p and q are non-negative integers (Hosking 1981). $\Phi(z)$ and $\Theta(z)$, with a scalar z, are polynomials with all roots outside the unit circle that share no common factors. The parameter d governs the memory of the process: when $d \in (-0.5, 0.5)$ the ARFIMA(p, d, q) process is said to be invertible and stationary. Besides, for $d \neq 0$, its autocovariance decays at a hyperbolic rate ($\gamma(h) = O(h^{-1+2d})$). For $d = 0$, $d \in (-0.5, 0)$ or $d \in (0, 0.5)$, the process is said to be short-memory, intermediate-memory or long-memory, respectively. The long-memory property is related to the behavior of the autocovariances, which are not absolutely summable, and the spectral density becomes unbounded at zero frequency. In the intermediate-memory region, the autocovariances are absolutely summable and, consequently, the spectral density is bounded.
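
The operator $(1 - B)^{-d}$ in Equation 1.1 can be made concrete through its power-series expansion. The following Python sketch is only an illustration under stated assumptions: the function names `frac_ma_coeffs` and `simulate_arfima_0d0` are hypothetical, the infinite moving-average expansion is truncated at the available sample, and only the pure fractional case ARFIMA(0, d, 0) with N(0, 1) innovations is covered.

```python
import numpy as np

def frac_ma_coeffs(d, n_terms):
    """Coefficients psi_k of (1 - B)^(-d) = sum_k psi_k B^k, k = 0..n_terms-1.

    Uses the recursion psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k.
    """
    psi = np.empty(n_terms)
    psi[0] = 1.0
    for k in range(1, n_terms):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi

def simulate_arfima_0d0(n, d, burn_in=1000, seed=0):
    """Approximate ARFIMA(0, d, 0) path from a truncated MA(infinity) filter.

    Assumption: the expansion is truncated at n + burn_in terms and the
    first burn_in values are discarded to reduce start-up effects.
    """
    rng = np.random.default_rng(seed)
    total = n + burn_in
    eps = rng.standard_normal(total)          # i.i.d. N(0, 1) innovations
    psi = frac_ma_coeffs(d, total)
    x = np.convolve(eps, psi)[:total]         # x_t = sum_{k<=t} psi_k * eps_{t-k}
    return x[burn_in:]

x = simulate_arfima_0d0(n=300, d=0.3)
```

The truncation is a simplification; exact simulation methods exist but are outside the scope of this sketch.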
The spectral density function of $\{X_t\}_{t\in\mathbb{Z}}$ is given by

$$f_X(\omega) = f_\eta(\omega)\left|2\sin\left(\frac{\omega}{2}\right)\right|^{-2d}, \quad \omega \in [-\pi, \pi],$$

where $f_\eta(\omega)$ is the spectral density of $\{\eta_t\}$. $f_X(\omega)$ is continuous except for $\omega = 0$, where it has a pole when $d > 0$. A recent review of the ARFIMA model and its properties can be found in Palma (2007) and Doukhan et al. (2003).

Many estimators for the fractional parameter d in long-memory time series have already been proposed in the literature. Among them are the semiparametric procedures, a group which includes a wide variety of estimators based on the Ordinary Least Squares (OLS) method. These procedures require the use of the spectral density parameterized within a neighborhood of zero frequency. Some references on this subject include the works of Geweke & Porter-Hudak (1983), Reisen (1994) and Robinson (1995a), among others. An overview of long-range dependence processes can be found in Beran (1994) and Doukhan et al. (2003).

Time series with outliers or atypical observations are quite common in any area of application. In the case where the data are time-dependent, several authors such as Ledolter (1989), Chang et al. (1988) and Chen & Liu (1993) have studied the effect of outliers in a time series that follows ARIMA models. In general, they have concluded that the parameter estimates of ARMA models become more biased when the data contain outliers. A similar conclusion holds when estimating the fractional parameter in ARFIMA models: the outliers cause a substantial bias in the differencing parameter (Fajardo et al. 2009).

A robust autocovariance function was proposed by Ma & Genton (2000). The asymptotic properties of this function are studied by Levy-Leduc et al. (2011a). The results presented in Fajardo et al. (2009), Levy-Leduc et al. (2011a) and Levy-Leduc et al. (2011b) are the motivations of this paper. The impact of outliers on the estimation of ARFIMA models is studied here in different contexts. The asymptotic properties of a robust autocovariance function are discussed and some empirical examples are used to illustrate the usefulness of a robust fractional parameter estimator. The invariance of the estimator to first differencing is also empirically studied. The outline of this paper is as follows: Section 2 discusses the model and the impact of outliers in time series. Section 3 summarizes the main results related to the robust autocovariance estimator given in Levy-Leduc et al. (2011a) and discusses the robust estimation of the fractional parameter in the ARFIMA model. Section 4 presents some empirical studies and an application is discussed in Section 5. Concluding remarks and future directions are given in Section 6.
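2. The impact of outliers in time series

Suppose $X_1, \ldots, X_n$ is a partial realization of $\{X_t\}_{t\in\mathbb{Z}}$. Hence, the periodogram function is defined as $I_X(\omega) = (2\pi n)^{-1}\left|\sum_{t=1}^{n} X_t e^{i\omega t}\right|^2$. It follows that, when $d = 0$ in the ARFIMA model,

$$I_X(\omega) = 2\pi f_X(\omega) I_\varepsilon(\omega) + R(\omega), \tag{2.1}$$

where $E[|R(\omega)|^2] = O(n^{-2\xi})$ ($\xi > 0$) uniformly in $\omega \in [-\pi, \pi]$ (Theorem 6.2.2 in Priestley (1981)) and $I_\varepsilon(\cdot)$ is the periodogram of the residuals. From Equation 1.1 and Theorem 6.1.1 in Priestley (1981), asymptotic sample properties of $I_X(\omega_j)/f_X(\omega_j)$ are derived and they are summarized as follows. If $\{\varepsilon_t\}_{t\in\mathbb{Z}}$ are normally distributed, for a fixed set of values of the Fourier frequencies $\omega_j = 2\pi j/n$, $j = 1, \ldots, \lfloor n/2 \rfloor$, where $\lfloor\cdot\rfloor$ means the integer part, asymptotically the set of variables $I_X(\omega_j)/f_X(\omega_j)$ is independently distributed, each distributed as $\chi^2_2/2$. At $\omega = 0$ and $\pi$, the distributions are $\chi^2_1$ (for details see Priestley (1981)). These asymptotic results for the periodogram lead to

$$E\left[\frac{I_X(\omega_j)}{f_X(\omega_j)}\right] \to 1 \quad \text{and} \quad \operatorname{var}\left[\frac{I_X(\omega_j)}{f_X(\omega_j)}\right] \to 1 + \delta(\omega_j), \quad \text{as } n \to \infty, \tag{2.2}$$

where $\delta(\omega_j) = 1$ if $\omega_j = 0, \pi$ and $0$ otherwise.

The above results establish the asymptotic unbiasedness and the inconsistency of the periodogram $I_X(\omega_j)$ as an estimator of $f_X(\omega_j)$.

Due to the singularity of $f_X(\omega)$ when $d > 0$, the standard results on the asymptotic distribution of the periodogram discussed previously cannot be applied to $I_X(\omega_j)$ for small and fixed j. Hurvich & Beltrao (1993) showed that $\lim_{n\to\infty} E\left[I_X(\omega_j)/f_X(\omega_j)\right]$ depends on j and d, and exceeds unity for most $d \neq 0$ (Künsch (1986); Robinson (1995b)). For $j \neq k$, $I_X(\omega_j)/f_X(\omega_j)$ and $I_X(\omega_k)/f_X(\omega_k)$ are correlated, and for a fixed value j and Gaussian processes, the limiting distribution of $I_X(\omega_j)/f_X(\omega_j)$ is not exponential (Robinson 1995b). That is, under the Gaussian assumption, Hurvich & Beltrao (1993) show that the normalized periodogram $I_X(\omega_j)/f_X(\omega_j)$ is asymptotically distributed as the quadratic form

$$\frac{a_1}{2}\chi_1 + \frac{a_2}{2}\chi_2, \tag{2.3}$$

where $\chi_1$ and $\chi_2$ are variables with Chi-squared distribution with one degree of freedom, $a_1 = L_j(d) - 2L_j^*(d)$, $a_2 = L_j(d) + 2L_j^*(d)$,

$$L_j(d) = \lim_{n\to\infty} E\left[\frac{I_X(\omega_j)}{f_X(\omega_j)}\right] = \frac{2}{\pi}\int_{-\infty}^{\infty} \frac{\sin^2(\omega/2)}{(2\pi j - \omega)^2}\left|\frac{\omega}{2\pi j}\right|^{-2d} d\omega \tag{2.4}$$

and

$$L_j^*(d) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\sin^2(\omega/2)}{(2\pi j - \omega)(2\pi j + \omega)}\left|\frac{\omega}{2\pi j}\right|^{-2d} d\omega. \tag{2.5}$$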
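
The periodogram defined at the start of this section can be evaluated at the Fourier frequencies with a fast Fourier transform. The sketch below is a minimal illustration, not the authors' implementation; the function name `periodogram` is hypothetical and NumPy's FFT is assumed to be an acceptable stand-in for the direct sum (the two differ only by a phase factor, which the squared modulus removes).

```python
import numpy as np

def periodogram(x):
    """Periodogram I_X(w_j) = |sum_t x_t exp(i w_j t)|^2 / (2 pi n)
    at the Fourier frequencies w_j = 2 pi j / n, j = 1..floor(n/2)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    j = np.arange(1, n // 2 + 1)
    omega = 2.0 * np.pi * j / n
    dft = np.fft.fft(x)[j]                    # same modulus as the sum in the text
    return omega, np.abs(dft) ** 2 / (2.0 * np.pi * n)

# Usage: for Gaussian white noise with unit variance, f_X = 1 / (2 pi), so the
# ratios I_X / f_X should fluctuate around one without concentrating.
rng = np.random.default_rng(0)
omega, I = periodogram(rng.standard_normal(512))
ratios = I * 2.0 * np.pi
```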
Let $\{Z_t\}_{t\in\mathbb{Z}}$ be a process contaminated by additive outliers, which is described by

$$Z_t = X_t + \sum_{j=1}^{m} \varpi_j Y_{j,t}, \tag{2.6}$$

where m is the maximum number of outliers; the unknown parameter $\varpi_j$ represents the magnitude of the jth outlier, and $Y_{j,t}$ ($\equiv Y_j$) is a random variable (r.v.) with probability distribution $\Pr(Y_j = -1) = \Pr(Y_j = 1) = p_j/2$ and $\Pr(Y_j = 0) = 1 - p_j$, where $E[Y_j] = 0$ and $E[Y_j^2] = \operatorname{var}(Y_j) = p_j$. Model 2.6 is based on the parametric models proposed by Fox (1972). $Y_j$ is the product of Bernoulli($p_j$) and Rademacher random variables; the latter equals 1 or -1, both with probability 1/2. $X_t$ and $Y_j$ are independent random variables.
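
As an illustration of Model 2.6 with a single outlier component (m = 1), the contamination can be generated as the product of Bernoulli and Rademacher draws. This is a sketch under assumptions: the function name `add_outliers` is hypothetical, and the draws are taken independently across time points.

```python
import numpy as np

def add_outliers(x, magnitude, p, seed=0):
    """Contaminate x following Model 2.6 with m = 1.

    Each Y_t is Bernoulli(p) times a Rademacher sign, so that
    Pr(Y_t = -1) = Pr(Y_t = 1) = p / 2 and Pr(Y_t = 0) = 1 - p.
    Returns the contaminated series z and the indicator series y.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    bernoulli = rng.random(x.size) < p                 # outlier occurs or not
    rademacher = rng.choice([-1.0, 1.0], size=x.size)  # sign of the outlier
    y = bernoulli * rademacher
    return x + magnitude * y, y

# Usage with the settings of the simulation study (p = 0.05, magnitude 10).
rng = np.random.default_rng(1)
z, y = add_outliers(rng.standard_normal(300), magnitude=10.0, p=0.05)
```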

Some results related to the effects of outliers on the spectral density and on the autocorrelation functions of $\{Z_t\}_{t\in\mathbb{Z}}$ are presented as follows.

PROPOSITION 2.1. Suppose that $\{Z_t\}_{t\in\mathbb{Z}}$ follows Model 2.6.

i. The autocovariance function (ACOVF) of $\{Z_t\}_{t\in\mathbb{Z}}$ is given by

$$\gamma_Z(h) = \gamma_X(h) + \delta(h)\sum_{j=1}^{m} \varpi_j^2\, p_j,$$

where $\gamma_X(h) = E[X_t X_{t+h}] - E[X_t]E[X_{t+h}]$ and

$$\delta(h) = \begin{cases} 1, & \text{when } h = 0, \\ 0, & \text{otherwise,} \end{cases}$$

with $h \in \mathbb{Z}$.

ii. The spectral density function of $\{Z_t\}$ is given by

$$f_Z(\omega) = f_X(\omega) + \frac{1}{2\pi}\sum_{j=1}^{m} \varpi_j^2\, p_j, \quad \omega \in (-\pi, \pi],$$

where $f_X(\omega) = \frac{1}{2\pi}\sum_{h=-\infty}^{\infty} \gamma_X(h) e^{-ih\omega}$.

Proposition 2.1 states that $\gamma_Z(h)$, for $h = 0$, depends on $\operatorname{var}(Y_j)$. $\gamma_Z(0)$ increases with $\operatorname{var}(Y_j)$ (see the proof in Fajardo et al. (2009)). This relation between $\gamma_Z(0)$ and $\operatorname{var}(Y_j)$ will certainly affect the model parameter estimates because it reduces the magnitude of the autocorrelations and introduces loss of information on the pattern of serial correlation (see also Chan (1992, 1995)).

The spectral form of $\{Z_t\}_{t\in\mathbb{Z}}$ (Model 2.6) when $\{X_t\}_{t\in\mathbb{Z}}$ follows an ARFIMA(p, d, q) model is given in the next lemma.

LEMMA 2.1. Let $\{X_t\}_{t\in\mathbb{Z}}$ be a stationary and invertible ARFIMA(p, d, q) process. Also, let $\{Z_t\}_{t\in\mathbb{Z}}$ be such that $Z_t = X_t + \sum_{j=1}^{m} \varpi_j Y_j$, where m is the maximum number of outliers, the unknown parameter $\varpi_j$ is the magnitude of the jth outlier and $Y_j$ is a r.v. with probability distribution $\Pr(Y_j = -1) = \Pr(Y_j = 1) = p_j/2$ and $\Pr(Y_j = 0) = 1 - p_j$. The spectral density of $\{Z_t\}_{t\in\mathbb{Z}}$ is

$$f_Z(\omega) = \frac{\sigma_\varepsilon^2}{2\pi}\frac{|\Theta(e^{-i\omega})|^2}{|\Phi(e^{-i\omega})|^2}\left|2\sin\left(\frac{\omega}{2}\right)\right|^{-2d} + \frac{1}{2\pi}\sum_{j=1}^{m} \varpi_j^2\, p_j,$$

where $\sigma_\varepsilon^2 = \operatorname{var}(\varepsilon_t)$.

The proof of Lemma 2.1 follows directly from Proposition 2.1.
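
The attenuation of the autocorrelations implied by Proposition 2.1 can be checked with simple arithmetic: only the lag-zero autocovariance is inflated, so every autocorrelation at nonzero lag shrinks by the factor $\gamma_X(0)/\gamma_Z(0)$. The helper below is a hypothetical sketch; the input autocovariances are assumed to be known population values.

```python
import numpy as np

def contaminated_acf(gamma_x, magnitudes, probs):
    """Autocorrelations of Z_t implied by Proposition 2.1 (i).

    gamma_x    : autocovariances of X_t at lags 0, 1, 2, ...
    magnitudes : outlier magnitudes (one per outlier component)
    probs      : outlier probabilities p_j
    """
    gamma_z = np.asarray(gamma_x, dtype=float).copy()
    extra = np.sum(np.asarray(magnitudes, dtype=float) ** 2
                   * np.asarray(probs, dtype=float))
    gamma_z[0] += extra                # only lag 0 is affected
    return gamma_z / gamma_z[0]

# Example: gamma_X = (1, 0.5), one outlier component of magnitude 10, p = 0.05.
# gamma_Z(0) = 1 + 100 * 0.05 = 6, so the lag-one autocorrelation falls
# from 0.5 to 0.5 / 6.
rho_z = contaminated_acf([1.0, 0.5], [10.0], [0.05])
```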

The effects of an outlier on the sample autocovariance function and on the periodogram are given below.

PROPOSITION 2.2. Let $z_1, z_2, \ldots, z_n$ be generated from Model 2.6 with one outlier, and let the outlier occur at time $t = T$ with $h < T < n - h$. It follows that:

i. The sample ACOVF is given by

$$\hat\gamma_Z(h) = \hat\gamma_X(h) \pm \frac{\varpi}{n}\left(x_{T-h} + x_{T+h} - 2\bar{x}\right) + \frac{\varpi^2}{n}\delta(h) + o_p(n^{-1}), \tag{2.7}$$

where $\hat\gamma_X(h) = \frac{1}{n}\sum_{t=1}^{n-h}(x_t - \bar{x})(x_{t+h} - \bar{x})$.

ii. The periodogram is given by

$$I_Z(\omega) = I_X(\omega) + \Delta(\omega), \quad \omega \in (-\pi, \pi],$$

where $I_X(\omega) = \frac{1}{2\pi}\sum_{h=-(n-1)}^{n-1} \hat\gamma_X(h) e^{-ih\omega}$, and

$$\Delta(\omega) = \frac{\varpi^2}{2\pi n} \pm \frac{\varpi}{2\pi n}\left\{\sum_{h=-(n-1)}^{n-1}\left(x_{T-h} + x_{T+h} - 2\bar{x}\right)\cos(h\omega)\right\} + o_p(n^{-1}).$$

These results show that outliers may substantially affect inference on stationary models: information on the serial correlation dynamics of the process is lost, and this loss carries over to parameter estimation.



3. The autocovariance and spectral density robust functions
3.1. The autocovariance function

Ma & Genton (2000) proposed a scale covariance estimator which is based on $Q_n(\cdot)$, defined in the sequel, and on the following covariance identity

$$\operatorname{cov}(X, Y) = \frac{1}{4ab}\left[\operatorname{var}(aX + bY) - \operatorname{var}(aX - bY)\right], \tag{3.1}$$

where X and Y are random variables, $a = 1/\sqrt{\operatorname{var}(X)}$ and $b = 1/\sqrt{\operatorname{var}(Y)}$ (Huber 2004).
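
Identity (3.1) holds for sample moments as well, which is what makes it usable for estimation. The short check below is illustrative only; the data and the helper name `cov_via_variances` are made up for the example.

```python
import numpy as np

def cov_via_variances(x, y):
    """Covariance computed from two variances, as in identity (3.1)."""
    a = 1.0 / np.std(x)
    b = 1.0 / np.std(y)
    return (np.var(a * x + b * y) - np.var(a * x - b * y)) / (4.0 * a * b)

rng = np.random.default_rng(1)
x = rng.standard_normal(500)
y = 0.6 * x + rng.standard_normal(500)

direct = np.cov(x, y, bias=True)[0, 1]    # usual sample covariance (divisor n)
indirect = cov_via_variances(x, y)        # same value through (3.1)
```

Replacing the variances in this identity by a robust squared scale estimate is the step that leads to the robust autocovariance estimator.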
Rousseeuw & Croux (1993) proposed a robust scale estimator function $Q_n(\cdot)$ which is based on the rth order statistic of the $\binom{n}{2}$ distances $\{|\eta_j - \eta_k|, j < k\}$, and can be written as

$$Q_n(\eta) = c \times \{|\eta_j - \eta_k|;\ j < k\}_{(r)}, \tag{3.2}$$

where $\eta = (\eta_1, \eta_2, \ldots, \eta_n)'$, c is a constant used to guarantee consistency ($c = 2.2191$ for the normal distribution) and $r = \left\lfloor \left(\binom{n}{2} + 2\right)/4 \right\rfloor + 1$.

Based on identity (3.1) and on $Q_n(\cdot)$, Ma & Genton (2000) proposed a highly robust estimator for the ACOVF:

$$\hat\gamma_Q(h) = \frac{1}{4}\left[Q^2_{n-h}(u + v) - Q^2_{n-h}(u - v)\right], \tag{3.3}$$

where u and v are vectors containing the initial $n - h$ and the final $n - h$ observations, respectively. The robust estimator for the autocorrelation function (ACF) is

$$\hat\rho_Q(h) = \frac{Q^2_{n-h}(u + v) - Q^2_{n-h}(u - v)}{Q^2_{n-h}(u + v) + Q^2_{n-h}(u - v)}.$$

It can be shown that $|\hat\rho_Q(h)| \leq 1$ for all h.
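
The estimators in Equations 3.2 and 3.3 can be sketched in a few lines. The version below is a naive illustration that forms all pairwise differences, which costs $O(n^2)$ memory and time; the function names are hypothetical, the order statistic index follows the formula for r given above, and faster algorithms exist for large samples.

```python
import numpy as np
from math import comb

def qn_scale(values, c=2.2191):
    """Q_n scale estimate: c times the r-th smallest pairwise distance."""
    v = np.asarray(values, dtype=float)
    n = v.size
    i, j = np.triu_indices(n, k=1)               # all pairs with i < j
    dists = np.sort(np.abs(v[i] - v[j]))
    r = (comb(n, 2) + 2) // 4 + 1                # order statistic index (1-based)
    return c * dists[r - 1]

def robust_acovf(x, h):
    """Robust autocovariance at lag h, Equation 3.3."""
    x = np.asarray(x, dtype=float)
    n = x.size
    u, v = x[: n - h], x[h:]                     # first and last n - h values
    return 0.25 * (qn_scale(u + v) ** 2 - qn_scale(u - v) ** 2)

def robust_acf(x, h):
    """Robust autocorrelation at lag h; bounded by one in absolute value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    u, v = x[: n - h], x[h:]
    q_plus, q_minus = qn_scale(u + v) ** 2, qn_scale(u - v) ** 2
    return (q_plus - q_minus) / (q_plus + q_minus)

# Usage: one gross outlier barely moves the robust scale estimate.
rng = np.random.default_rng(0)
clean = rng.standard_normal(200)
dirty = clean.copy()
dirty[100] = 1e6
q_clean, q_dirty = qn_scale(clean), qn_scale(dirty)
```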



3.1.1. Influence Function and Breakdown Point.

The Influence Function (IF) is an important tool to understand the effect of contamination by an outlier on any estimator. To define the IF, suppose that the empirical cumulative distribution function $F_n$ of $x_1, \ldots, x_n$, adequately normalized, converges. Following Huber (2004), the influence function $x \mapsto IF(x, T, F)$ is defined for a functional T at a distribution F and at point x as the limit

$$IF(x, T, F) = \lim_{\epsilon \to 0^+} \epsilon^{-1}\left\{T(F + \epsilon(\delta_x - F)) - T(F)\right\},$$

where $\delta_x$ is the Dirac distribution at x.

The Breakdown Point (BP) indicates the largest proportion of outliers that the data may contain such that the estimator still gives some information about the distribution of the outlier-free data (Maronna et al. (2006)). Rousseeuw & Croux (1993) showed that the asymptotic BP of $Q_n(\cdot)$ is 50%, which means that up to half of the observations can be contaminated by outliers and $Q_n(\cdot)$ will still yield sensible estimates.

The classical notion of sample BP of a scale estimator $S_n(\cdot)$ is given in Definition 3.1.

Definition 3.1. Let $\eta = (\eta_1, \eta_2, \ldots, \eta_n)'$ be a sample of size n. Let $\tilde\eta$ be obtained by replacing any m observations of $\eta$ by arbitrary values. The sample breakdown point of a scale estimator $S_n(\eta)$ is given by

$$\varepsilon^*_n(S_n(\eta)) = \max\left\{\frac{m}{n} : \sup_{\tilde\eta} S_n(\tilde\eta) < \infty \text{ and } \inf_{\tilde\eta} S_n(\tilde\eta) > 0\right\}.$$
The above BP definition holds for a scale estimator function of a time invariant random sample. As noted by Ma & Genton (2000), in time series, the estimators are based on differences between observations separated by various time lags and usually have a BP with respect to these differences. Then, the time location of the outlier becomes important (see also, for example, Ledolter (1989)). Therefore, the authors introduced the following definition of a temporal sample breakdown point of an autocovariance estimator $\hat\gamma(h)$ based on (3.1).

Definition 3.2. Let $\eta = (\eta_1, \eta_2, \ldots, \eta_n)'$ be a sample of size n and let $\tilde\eta$ be obtained by replacing any m observations of $\eta$ by arbitrary values. Denote by $I_m$ a subset of size m of $\{1, 2, \ldots, n\}$. The temporal sample breakdown point of an autocovariance estimator $\hat\gamma(h)$ is given by

$$\varepsilon^{temp}_n(\hat\gamma(h)) = \max\left\{\frac{m}{n} : \sup_{I_m}\sup_{\tilde\eta} S_{n-h}(\tilde u + \tilde v) < \infty,\ \inf_{I_m}\inf_{\tilde\eta} S_{n-h}(\tilde u + \tilde v) > 0,\ \sup_{I_m}\sup_{\tilde\eta} S_{n-h}(\tilde u - \tilde v) < \infty \text{ and } \inf_{I_m}\inf_{\tilde\eta} S_{n-h}(\tilde u - \tilde v) > 0\right\},$$

where $\tilde u$ and $\tilde v$ are derived from $\tilde\eta$ as in (3.3).

Remark 1. The relation between the classical sample and the temporal sample 
breakdown points can be expressed by the following inequality (Ma & Genton 
(2000)): 

It then follows that since the sample breakdown point of the classical autocovari- 
ance estimator is zero, the temporal breakdown point of this estimator is also 
zero. This means that only one single outlier is enough to 'break' the estimator. 

Ma & Genton (2000) showed that the maximum temporal breakdown point of 
the highly robust autocovariance estimator is 25%, which is the highest possible 
breakdown point for an autocovariance estimator. 

Results on the asymptotic properties of the robust autocovariance function for a Gaussian ARFIMA model are summarized as follows (see Levy-Leduc et al. (2011a)).

3.1.2. Short-memory case

Let $\{X_t\}_{t\in\mathbb{Z}}$ be a stationary mean-zero Gaussian process given by Model 1.1 with $d = 0$, that is, the autocovariance function ($\gamma(h) = E(X_1 X_{h+1})$) of $\{X_t\}_{t\in\mathbb{Z}}$ satisfies

$$\sum_{h \geq 1} |\gamma(h)| < \infty.$$

The following theorems present the asymptotic behavior of the robust autocovariance estimator.

Theorem 3.1. Let h be a non-negative integer. Under the assumption that the autocovariances are absolutely summable, the autocovariance estimator $\hat\gamma_Q(h, X_{1:n}, \Phi)$ satisfies the following Central Limit Theorem:

$$\sqrt{n}\left(\hat\gamma_Q(h, X_{1:n}, \Phi) - \gamma(h)\right) \xrightarrow{d} N(0, \sigma^2(h)),$$

where

$$\sigma^2(h) = E[\psi^2(X_1, X_{1+h})] + 2\sum_{k \geq 1} E[\psi(X_1, X_{1+h})\psi(X_{k+1}, X_{k+1+h})], \tag{3.4}$$

and $\psi$ is a function of $\gamma(h)$ and of the IF (see Theorem 4 in Levy-Leduc et al. (2011a)).

3.1.3. Long-memory case

Now, let $d \neq 0$ in Model 1.1 and let $D = 1 - 2d$. The autocovariance function behaves like

$$\gamma(h) = h^{-D} L(h), \quad 0 < D < 1,$$

where L is slowly varying at infinity and is positive for large h. Note that, for positive d, as previously stated, the autocovariance function of the process is not absolutely summable.

Theorem 3.2. Let h be a non-negative integer. Then, $\hat\gamma_Q(h, X_{1:n}, \Phi)$ satisfies the following limit theorems as n tends to infinity.

• If $D > 1/2$,

$$\sqrt{n}\left(\hat\gamma_Q(h, X_{1:n}, \Phi) - \gamma(h)\right) \xrightarrow{d} N(0, \sigma^2(h)),$$

where

$$\sigma^2(h) = E[\psi^2(X_1, X_{1+h})] + 2\sum_{k \geq 1} E[\psi(X_1, X_{1+h})\psi(X_{k+1}, X_{k+1+h})],$$

and $\psi$ is a function of $\gamma(h)$ and of the IF (see Theorems 4 and 5 in Levy-Leduc et al. (2011a)).

• If $D < 1/2$,

$$\beta(D)\frac{n^D}{\tilde L(n)}\left(\hat\gamma_Q(h, X_{1:n}, \Phi) - \gamma(h)\right) \xrightarrow{d} \frac{\gamma(0) + \gamma(h)}{2}\left(Z_{2,D}(1) - Z_{1,D}(1)^2\right),$$

where $\beta(D) = B((1 - D)/2, D)$, B denotes the Beta function, the processes $Z_{1,D}(\cdot)$ and $Z_{2,D}(\cdot)$ are defined by Equations 53 and 54, respectively, in Levy-Leduc et al. (2011a), and

$$\tilde L(n) = 2L(n) + L(n + h)(1 + h/n)^{-D} + L(n - h)(1 - h/n)^{-D}. \tag{3.5}$$
Remark 2. For Model 1.1 with $1/4 < d < 1/2$, the robust autocovariance estimator $\hat\gamma_Q(h, X_{1:n}, \Phi)$ has the same asymptotic behavior as the classical autocovariance estimator $\hat\gamma_X(h)$.

The theory underlying the use of the robust ACF to obtain a spectral estimate is still an open question. However, this approach was first empirically investigated by Fajardo et al. (2009). The authors considered a robust estimator of the spectral density based on the robust ACF when the time series follows an ARFIMA model. Their estimation method is discussed in the next subsection.



3.2. The sample spectral function 

The results discussed in the previous sections and the spectral representation of a stationary process justify the use of the robust ACF in the computation of an estimator of the spectral density.

As previously stated, for the stationary process $\{X_t\}_{t\in\mathbb{Z}}$, the spectral density is a real-valued function given by the Fourier transform of the autocovariance function, that is,

$$f_X(\omega) = \frac{1}{2\pi}\sum_{h=-\infty}^{\infty} \gamma_X(h) e^{-ih\omega}, \tag{3.6}$$

where $\gamma_X(\cdot)$ is the autocovariance of the process.

Equation 3.6 suggests replacing $\gamma_X(\cdot)$ by its estimate to obtain an estimate of $f_X(\omega)$. The periodogram function is the classical tool to estimate the spectral function. Other variants of the periodogram are called smoothed window periodograms (see, e.g., Priestley (1981)). In the same direction, Fajardo et al. (2009) suggested using the robust autocovariance function in place of the classical sample ACOVF to obtain a robust spectral function. Although the theoretical justification of this estimator is still an open question, the authors have empirically shown that the robust spectral estimator can be an alternative method for time series with outliers. A robust spectral estimator is

$$I_Q(\omega) = \frac{1}{2\pi}\sum_{|h| < n} \kappa(h)\,\hat\gamma_Q(h)\cos(h\omega), \tag{3.7}$$

where $\hat\gamma_Q(h)$ is the sample autocovariance function given in Equation 3.3 and $\kappa(h)$ is defined as

$$\kappa(h) = \begin{cases} 1, & |h| \leq M, \\ 0, & |h| > M. \end{cases}$$

$\kappa(h)$ is a particular case of the lag window functions used in classical spectral theory to obtain a consistent spectral estimator, and M is the truncation point, which is a function of n, say $M = G(n)$, where $G(n)$ must satisfy $G(n) \to \infty$, $n \to \infty$, with $G(n)/n \to 0$. $G(n)$ is usually chosen to be $G(n) = n^{\beta}$, where $0 < \beta < 1$ (see, e.g., Priestley (1981, pp. 433-437)). Note that, as in classical spectral estimation theory, other lag window functions can be used to obtain a robust spectral estimator.

Since Equation 3.7 does not have the same finite-sample properties as the periodogram, it is defined here as the robust truncated pseudo-periodogram. For large h, the number of observations used to compute $\hat\gamma_Q(h)$ is very small and, consequently, this function becomes very unstable. Avoiding these undesirable covariance estimates in the estimator given in Equation 3.7 justifies the use of a truncation point M in the computation of this sample function (see Fajardo et al. (2009)). The authors suggested an M that satisfies

$$M < \tilde h = \min\left\{0 < h < n : \varepsilon^{temp}_n(\hat\gamma_Q(h)) < \frac{m}{n}\right\} - 1.$$
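
Given autocovariance estimates at lags 0, 1, ..., the truncated sum in Equation 3.7 reduces to a cosine series. The sketch below assumes the rectangular lag window and a $1/(2\pi)$ normalization matching Equation 3.6; the function name is hypothetical, and any autocovariance sequence (robust or classical) can be passed in.

```python
import numpy as np

def truncated_pseudo_periodogram(acov, omegas, M):
    """Evaluate (1 / 2 pi) * sum_{|h| <= M} acov(|h|) * cos(h * omega).

    acov   : autocovariance estimates at lags 0, 1, 2, ...
    omegas : frequencies at which the estimate is evaluated
    M      : truncation point (lags above M receive weight zero)
    """
    acov = np.asarray(acov, dtype=float)
    omegas = np.atleast_1d(np.asarray(omegas, dtype=float))
    M = min(M, acov.size - 1)
    lags = np.arange(1, M + 1)
    # Lags h and -h contribute equally, hence the factor 2 for h >= 1.
    cos_sum = np.cos(np.outer(omegas, lags)) @ acov[1 : M + 1]
    return (acov[0] + 2.0 * cos_sum) / (2.0 * np.pi)

# Usage with the truncation rule M = n^beta, beta = 0.7, for n = 300.
n, beta = 300, 0.7
M = int(n ** beta)
omegas = 2.0 * np.pi * np.arange(1, n // 2 + 1) / n
flat = truncated_pseudo_periodogram([1.0, 0.0, 0.0], omegas, M)
```

A flat autocovariance sequence gives a constant estimate equal to $1/(2\pi)$, which is the spectral density of unit-variance white noise. Note that a truncated estimate of this kind is not guaranteed to be positive, which matters before taking logarithms.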

4. Semiparametric estimation methods of d and empirical studies 

The semiparametric estimation procedure based on the OLS estimator proposed 
by Geweke & Porter-Hudak (1983)(GPH) is considered. Since the GPH estima- 
tor is well-discussed in the literature, this method and its asymptotic statistical 
properties are briefly summarized as follows. 

For a single realization $x_1, \ldots, x_n$ of $\{X_t\}_{t\in\mathbb{Z}}$, the GPH estimate of d is obtained from the regression equation

$$\log I_X(\omega_j) = a_0 - 2d\log\left[2\sin(\omega_j/2)\right] + u_j, \quad j = 1, \ldots, m', \tag{4.1}$$

where $\omega_j$ is the Fourier frequency at j, $m'$ is the bandwidth in the regression equation, which has to satisfy $m' \to \infty$, $n \to \infty$, with $\frac{m'}{n} \to 0$ and $\frac{m'\log(m')}{n} \to 0$, $a_0 = \log f_\eta(0) + \log\frac{f_\eta(\omega_j)}{f_\eta(0)} + C$, $u_j = \log\frac{I_X(\omega_j)}{f_X(\omega_j)} - C$ and $C = \psi(1)$ ($\psi(\cdot)$ is the digamma function).

The GPH estimate of d is given by

$$\hat d_{GPH} = (-0.5)\,\frac{\sum_{j=1}^{m'}(v_j - \bar v)\log I_X(\omega_j)}{S_{vv}}, \tag{4.2}$$

where $S_{vv} = \sum_{j=1}^{m'}(v_j - \bar v)^2$ and $v_j = \log\{2\sin(\omega_j/2)\}$, the regressor of Equation 4.1.

Under some conditions, Hurvich et al. (1998) proved that the GPH estimator is consistent for the memory parameter and asymptotically normal for Gaussian time series processes. The authors established that the optimal $m'$ in Equations 4.1 and 4.2 is of order $o(n^{4/5})$ and $(m')^{1/2}(\hat d_{GPH} - d) \xrightarrow{d} N(0, \pi^2/24)$.
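
The log-periodogram regression can be written compactly as an ordinary least squares slope. The following sketch is an illustration rather than the authors' code: the function name `gph_estimate` is hypothetical, the regressor is taken to be $\log\{2\sin(\omega_j/2)\}$ so that the slope equals $-2d$, and the spectral values passed in may come from the periodogram or from any other positive spectral estimate.

```python
import numpy as np

def gph_estimate(spec_values, omegas):
    """OLS estimate of d from log(spec) = a - 2 d log(2 sin(w / 2)) + u."""
    omegas = np.asarray(omegas, dtype=float)
    v = np.log(2.0 * np.sin(omegas / 2.0))       # regressor
    y = np.log(np.asarray(spec_values, dtype=float))
    vc = v - v.mean()
    slope = np.sum(vc * y) / np.sum(vc ** 2)     # estimates -2 d
    return -0.5 * slope

# Sanity check on noise-free input: if the spectral values follow the
# power law exactly, the regression recovers d.
n, alpha, d_true = 512, 0.7, 0.3
m = int(n ** alpha)                               # bandwidth m' = n^alpha
omegas = 2.0 * np.pi * np.arange(1, m + 1) / n
exact = 3.0 * (2.0 * np.sin(omegas / 2.0)) ** (-2.0 * d_true)
d_hat = gph_estimate(exact, omegas)
```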

To obtain a robust estimator of d, Fajardo et al. (2009) proposed replacing $\log I_X(\omega_j)$ in Equation 4.1 by $\log I_Q(\omega_j)$, which gives the following OLS regression estimator

$$\hat d_{GPHR} = -(0.5)\,\frac{\sum_{j=1}^{m'}(v_j - \bar v)\log I_Q(\omega_j)}{S_{vv}}, \tag{4.3}$$

where $S_{vv}$ and $m'$ are defined as before and $I_Q(\omega)$ is the function given in Equation 3.7. As previously mentioned, the asymptotic properties of $\hat d_{GPHR}$ still remain to be established. However, based on the following empirical investigation, the robust method seems to be a reasonable alternative for estimating long-memory time series in the presence of additive outliers.

4.1. Numerical evaluation using the ARFIMA(0, d, 0) model

The finite series were simulated from zero-mean ARFIMA models (Eq. 1.1) with $\{\varepsilon_t\}_{t\in\mathbb{Z}}$, $t = 1, \ldots, n$, i.i.d. N(0, 1). The models, parameters, sample sizes and empirical results are displayed in the following tables. The empirical mean, standard deviation (s.d.), bias and mean squared error (MSE) were obtained as a mean over 10,000 replications. The contaminated data were generated from Model 2.6 with m = 1, p = 0.05 and magnitude $\varpi = 10$. The bandwidths of $\hat d_{GPH}$ and $\hat d_{GPHR}$ were computed with $\alpha = 0.7$ (that is, $m' = n^{\alpha}$) and the truncation point was $M = n^{\beta}$, $\beta = 0.7$. In the tables, $\hat d_{GPH_c}$ and $\hat d_{GPHR_c}$ denote the estimates of d when the series has outliers. The simulations were carried out using the Ox matrix programming language (see http://www.doornik.com). The empirical study is divided according to the following model properties: stationary and non-stationary processes.

4.1.1. Stationary model 

Table 1 displays results for d = 0.3, 0.45 and $\alpha = \beta = 0.7$. It can be seen that when the series does not contain outliers, both estimators behave similarly in the estimation of d, which is not surprising. However, the introduction of outliers in the series dramatically changes the performance of the classical estimator (GPH); in particular, it significantly underestimates the true parameter. On the other hand, in this scenario, the robust method (GPHR) seems not to be sensitive to outliers. Other cases were also simulated, such as ARFIMA models with AR and MA parts and different values of p and $\varpi$. All cases led to conclusions similar to those drawn from Table 1. These are available upon request. Table 2 gives the estimates of d when different lag windows are used to compute the robust periodogram estimator. The lag windows are Parzen (P), Tukey-Hamming (TH) and Bartlett (B), and the fractional estimators were computed with the same bandwidths as in the previous case. The choice of the lag window does not appear to be too important in the estimation of d, since the estimates obtained from different lag windows are, in general, numerically very close to each other. These lag windows yield estimates of similar accuracy to those obtained with the window in Equation 3.7.

Table 1
Simulation results: ARFIMA(0, d, 0) model with $\alpha = \beta = 0.7$ and $\varpi = 0, 10$.

d      n     statistic   d_GPH     d_GPH_c   d_GPHR    d_GPHR_c
0.30   100   mean        0.2988    0.1134    0.2584    0.2449
             s.d.        0.1735    0.1619    0.1558    0.1556
             bias       -0.0012   -0.1866   -0.0416   -0.0551
             MSE         0.0301    0.0610    0.0260    0.0272
       300   mean        0.3062    0.1007    0.2907    0.2837
             s.d.        0.1005    0.0978    0.0926    0.0960
             bias        0.0062   -0.1993   -0.0093   -0.0163
             MSE         0.0101    0.0493    0.0087    0.0095
       800   mean        0.3003    0.1184    0.2949    0.2869
             s.d.        0.0679    0.0715    0.0573    0.0610
             bias        0.0003   -0.1816   -0.0051   -0.0131
             MSE         0.0046    0.0381    0.0033    0.0039
0.45   100   mean        0.4561    0.1923    0.3975    0.3778
             s.d.        0.1722    0.1727    0.1506    0.1433
             bias        0.0061   -0.2577   -0.0525   -0.0722
             MSE         0.0297    0.0962    0.0254    0.0258
       300   mean        0.4594    0.2015    0.4329    0.4233
             s.d.        0.0986    0.0976    0.1041    0.1013
             bias        0.0094   -0.2485   -0.0171   -0.0267
             MSE         0.0098    0.0713    0.0111    0.0110
       800   mean        0.4620    0.2306    0.4457    0.4349
             s.d.        0.0688    0.0809    0.0562    0.0576
             bias        0.0121   -0.2194   -0.0043   -0.0151
             MSE         0.0049    0.0547    0.0032    0.0035

Table 2
Empirical results of d's estimators in ARFIMA(0, d, 0) model using different lag-windows.

Uncontaminated series
Parameter   n     statistic   d_P       d_TH      d_B
d = 0.3     100   mean        0.2699    0.2602    0.2459
                  s.d.        0.1497    0.1575    0.1444
                  bias       -0.0301   -0.0398   -0.0541
                  MSE         0.0233    0.0264    0.0238
            300   mean        0.2880    0.2833    0.2857
                  s.d.        0.1050    0.1037    0.0976
                  bias       -0.0119   -0.0167   -0.0143
                  MSE         0.0112    0.0110    0.0097
            800   mean        0.2985    0.2966    0.3001
                  s.d.        0.0554    0.0584    0.0561
                  bias       -0.0015   -0.0034    0.0001
                  MSE         0.0031    0.0034    0.0031

Contaminated series
Parameter   n     statistic   d_P       d_TH      d_B
d = 0.3     100   mean        0.2504    0.2446    0.2419
                  s.d.        0.1552    0.1482    0.1405
                  bias       -0.0496   -0.0554   -0.0581
                  MSE         0.0266    0.0250    0.0231
            300   mean        0.2806    0.2729    0.2796
                  s.d.        0.1028    0.0925    0.0964
                  bias       -0.0194   -0.0271   -0.0204
                  MSE         0.0109    0.0093    0.0097
            800   mean        0.2934    0.2889    0.2928
                  s.d.        0.0578    0.0606    0.0553
                  bias       -0.0066   -0.0111   -0.0072
                  MSE         0.0034    0.0038    0.0031

imsart-generic ver. 2011/11/15 file: reisenfajardo.tex date: December 30, 2011
V. Reisen and F. Fajardo / Robust estimation in time series

4.1.2. Non-stationary model

As is well known, the GPH estimator has been widely used even when the ARFIMA model has d in (0.5, 1.0] (see, for example, Franco & Reisen (2007), Hurvich & Ray (1995), Olbermann et al. (2006) and Phillips (2007), among others).

Based on the theory discussed in the previous sections, the robust method cannot be applied to a non-stationary time series. However, it is of interest to verify whether the GPHR estimator is invariant to first differencing, i.e., whether the estimate of the memory parameter based on the original data equals one plus the estimate of d based on the differenced data.

Now, let Model 1.1 be defined with parameter $d^* = d + \kappa$, where $d \in (-0.5, 0.5)$, $\kappa > 0$, $\kappa \in \mathbb{Z}$. Then, Model 1.1, with zero mean, becomes

$$X_t = (1 - B)^{-d^*}\eta_t, \quad t \in \mathbb{Z}. \tag{4.4}$$

The process given in Equation 4.4 is non-stationary when $d^* > 0.5$; however, it is still persistent. For $d^* \in [0.5, 1.0)$ it is level-reverting in the sense that there is no long-run impact of an innovation on the value of the process. The level-reversion property no longer holds when $d^* \geq 1$. Note that when $d^* = 1$ the process is a random walk.

From Model 4.4 with $\kappa = 1$ and $p = q = 0$,

$$W_t = (1 - B)X_t, \quad t \in \mathbb{Z},$$

is an ARFIMA(0, d, 0) process. Let $\hat d^*$ be the estimator of $d^*$ and $\hat d$ the fractional estimator obtained from the differenced data. The main goal is to verify the equality $\hat d^* = \hat d + 1$ for uncontaminated and contaminated series. Based on the same simulation procedure previously described, series from Model 4.4 were generated and some cases are displayed in Table 3 (other cases are available upon request). Conclusions similar to those of the previous study are observed. Both estimators perform equivalently when they are applied to the first difference of uncontaminated series. This suggests that both can be used in practical situations when dealing with non-stationary data. However, since the first difference does not eliminate the effect of an outlier, the estimates clearly indicate that caution has to be exercised when outliers are suspected in the data. The GPH estimator performs poorly in terms of bias (high positive bias) and MSE. In contrast to the GPH estimator, the GPHR method seems to be invariant to the first difference of non-stationary time series with outliers. This empirical study suggests that, in practical situations when dealing with non-stationary data with outliers, one solution is to take the first difference of the series and then estimate d with the robust estimator discussed in this paper.
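
The relation $d^* = d + 1$ behind this exercise can be verified directly on the filter coefficients: first-differencing a series built with $(1 - B)^{-(d+1)}$ yields the series built with $(1 - B)^{-d}$ from the same innovations. The sketch below is illustrative only; `frac_ma_coeffs` is a hypothetical helper, and the filters are truncated at the start of the sample, which is an assumption of the example rather than part of the model.

```python
import numpy as np

def frac_ma_coeffs(d, n_terms):
    """Coefficients psi_k of (1 - B)^(-d), k = 0..n_terms-1."""
    psi = np.empty(n_terms)
    psi[0] = 1.0
    for k in range(1, n_terms):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi

n, d = 300, -0.2                       # d* = d + 1 = 0.8, as in Table 3
rng = np.random.default_rng(0)
eps = rng.standard_normal(n)

# Non-stationary series X_t with memory d* and its first difference W_t.
x = np.convolve(eps, frac_ma_coeffs(d + 1.0, n))[:n]
w = np.diff(x, prepend=0.0)

# Series generated directly with memory parameter d from the same innovations.
w_direct = np.convolve(eps, frac_ma_coeffs(d, n))[:n]

same_path = np.allclose(w, w_direct)   # differencing lowers the memory by one
```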
Table 3
Empirical results: ARFIMA(0, d, 0) model with differenced data and $\varpi = 0, 10$.

Parameter               n     statistic   d_GPH     d_GPH_c   d_GPHR    d_GPHR_c
d_X = 0.8, d_W = -0.2   300   mean       -0.2141   -0.5066   -0.1906   -0.2211
                              bias        0.0141    0.3066   -0.0094    0.0211
                              s.d.        0.1076    0.1469    0.1127    0.1421
                              MSE         0.0118    0.1155    0.0128    0.0206
                        800   mean       -0.1906   -0.4283   -0.2062   -0.2250
                              bias       -0.0094    0.2283    0.0062    0.0251
                              s.d.        0.0630    0.0883    0.0851    0.1081
                              MSE         0.0041    0.0599    0.0073    0.0123
d_X = 1.0, d_W = 0.0    100   mean       -0.0048   -0.4166   -0.0449   -0.0871
                              bias        0.0048    0.4166    0.0449    0.0871
                              s.d.        0.1763    0.2215    0.1620    0.1811
                              MSE         0.0311    0.2226    0.0283    0.0404
                        300   mean       -0.0122   -0.3230   -0.0273   -0.0426
                              bias        0.0122    0.3230    0.0273    0.0426
                              s.d.        0.1076    0.1296    0.1094    0.1277
                              MSE         0.0117    0.1211    0.0127    0.0181
                        800   mean        0.0059   -0.2181   -0.0107   -0.0222
                              bias       -0.0059    0.2181    0.0107    0.0222
                              s.d.        0.0648    0.0823    0.0629    0.0909
                              MSE         0.0042    0.0544    0.0041    0.0088


5. Application 

IGP-DI is the general price index with domestic availability and is calculated by Fundação Getulio Vargas, Brazil. The series comprises monthly observations from August 1994 to April 2011 (a total of 201 observations). The series and its ACF are displayed in Figure 1. The observations for February 1999 (4.44%), October 2002 (4.21%) and November 2002 (5.84%) are possibly outliers. The plots in Figure 1 suggest that the series is stationary and possesses long-memory behavior. From the data and using the methodologies previously discussed, the parameter d is estimated and the estimates are displayed in Table 4. For this application, the estimates of d are computed from the original data (OD) and from the modified data (MD), where the observations of February 1999, October 2002 and November 2002 are replaced by the sample mean of the series. This analysis is a simple exercise to verify the robustness of the estimators in a real application and, also, to investigate whether the data contain outliers. The estimates of d for the OD and MD series are given, respectively, on the left and the right side of Table 4. These estimates were calculated using different bandwidths in Equation 4.2 ($m' = n^{\alpha}$), and $\beta$ was fixed as in the simulation study. In both series, for a fixed $\alpha$, the robust methods give similar results. The estimates maintain the same empirical property across the bandwidth values. In contrast to the robust methods, the classical GPH estimator gives estimates that change dramatically from OD to MD, which indicates that the observations replaced by the mean are possibly atypical.
Fig 1. IGP-DI series and its sample autocorrelation function: period from Aug/94 to Apr/11.

Table 4
Estimates of d: IGP-DI data, period from Aug/94 to Apr/11.

             Original time series                        Modified time series
Estimator    α = 0.5    α = 0.6    α = 0.7    α = 0.8    α = 0.5    α = 0.6    α = 0.7    α = 0.8
d_GPH        0.0757     0.1205     0.3431     0.3759     0.3110     0.3116     0.3713     0.3875
             (0.3417)   (0.1869)   (0.1389)   (0.0888)   (0.1586)   (0.1077)   (0.0909)   (0.0683)
d_GPHR_P     0.1802     0.2335     0.2269     0.2397     0.1630     0.2077     0.2078     0.2230
             (0.0857)   (0.0745)   (0.0469)   (0.0331)   (0.0782)   (0.0603)   (0.0385)   (0.0251)
d_GPHR_TH    0.1718     0.1919     0.2125     0.2379     0.1545     0.1782     0.1968     0.2231
             (0.0742)   (0.0508)   (0.0303)   (0.0210)   (0.0673)   (0.0436)   (0.0259)   (0.0170)
d_GPHR_B     0.1522     0.1788     0.2047     0.2327     0.1379     0.1667     0.1896     0.2181
             (0.0641)   (0.0433)   (0.0262)   (0.0183)   (0.0586)   (0.0378)   (0.0227)   (0.0151)
d_GPHR       0.1662     0.2628     0.2454     0.2285     0.1500     0.2211     0.2215     0.2228
             (0.0862)   (0.0995)   (0.0671)   (0.0436)   (0.0794)   (0.0717)   (0.0511)   (0.0328)

6. Concluding remarks and future directions

This paper investigates the effect of outliers on the estimation of the fractional parameter d in the ARFIMA(p, d, q) model and also discusses the asymptotic and empirical properties of the robust autocovariance and spectral estimators, previously given in Fajardo et al. (2009) and Levy-Leduc et al. (2011a), for time series with short and long-memory properties. These studies support the use of the robust estimators to estimate the long-memory parameter when Gaussian long-memory time series are contaminated with additive outliers. Non-stationary time series with outliers are also studied and the investigation reveals that the robust method can be used as an alternative estimation procedure in time series with fractional differences. As previously stated, the asymptotic properties of the robust estimator under study still remain to be investigated. The robust ACF method discussed here has also been used in other contexts, such as the estimation of periodic processes (Sarnaglia et al. (2010)) and seasonal ARFIMA processes (a topic of the authors' current research).